New Sparse Matrix Storage Format to Improve The Performance of Total SPMV Time

نویسندگان

Neelima Reddy

Raghavendra Prakash

Ram Mohana Reddy

چکیده

Graphics Processing Units (GPUs) are massive data parallel processors. High performance comes only at the cost of identifying data parallelism in the applications while using data parallel processors like GPU. This is an easy effort for applications that have regular memory access and high computation intensity. GPUs are equally attractive for sparse matrix vector multiplications (SPMV for short) that have irregular memory access. SPMV is an important computation in most of the scientific and engineering applications and scaling the performance, bandwidth utilization and compute intensity (ratio of computation to the data access) of SPMV computation is a priority in both academia and industry. There are various data structures and access patterns proposed for sparse matrix representation on GPUs and optimizations and improvements on these data structures is a continuous effort. This paper proposes a new format for the sparse matrix representation that reduces the data organization time and the memory transfer time from CPU to GPU for the memory bound SPMV computation. The BLSI (Bit Level Single Indexing) sparse matrix representation is up to 204% faster than COO (Co-ordinate), 104% faster than CSR (Compressed Sparse Row) and 217% faster than HYB (Hybrid) formats in memory transfer time from CPU to GPU. The proposed sparse matrix format is implemented in CUDA-C on CUDA (Compute Unified Device Architecture) supported NVIDIA graphics cards.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Implementing Sparse Matrix-Vector Multiplication with QCSR on GPU

We are going through the computation from single core to multicore architecture in parallel programming. Graphics Processor Units (GPUs) have recently emerged as outstanding platforms for data parallel applications with regular data access patterns. However, it is still challenging to optimize computations with irregular data access patterns like sparse matrix-vector multiplication (SPMV). SPMV...

متن کامل

Accelerating Sparse Matrix Vector Multiplication on Many-Core GPUs

Many-core GPUs provide high computing ability and substantial bandwidth; however, optimizing irregular applications like SpMV on GPUs becomes a difficult but meaningful task. In this paper, we propose a novel method to improve the performance of SpMV on GPUs. A new storage format called HYB-R is proposed to exploit GPU architecture more efficiently. The COO portion of the matrix is partitioned ...

متن کامل

A new approach for sparse matrix vector product on NVIDIA GPUs

The sparse matrix vector product (SpMV) is a key operation in engineering and scientific computing and, hence, it has been subjected to intense research for a long time. The irregular computations involved in SpMV make its optimization challenging. Therefore, enormous effort has been devoted to devise data formats to store the sparse matrix with the ultimate aim of maximizing the performance. G...

متن کامل

The sparse matrix vector product on GPUs

The sparse matrix vector product (SpMV) is a paramount operation in engineering and scientific computing and, hence, has been a subject of intense research for long. The irregular computations involved in SpMV make its optimization challenging. Therefore, enormous effort has been devoted to devise data formats to store the sparse matrix with the ultimate aim of maximizing the performance. The G...

متن کامل

Applications of the streamed storage format for sparse matrix operations

The streamed storage format for sparse matrices showed good performance improvement for sparse matrix and vector multiply (SpMV) compared with compressed sparse row (CSR) and block CSR (BCSR) formats, particularly on IBM Power processors. We extend the format to exploit single instruction multiple data (SIMD) instructions in order to utilize the vector unit, and discuss how the streamed formats...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Scalable Computing: Practice and Experience

دوره 13 شماره

صفحات -

تاریخ انتشار 2012

New Sparse Matrix Storage Format to Improve The Performance of Total SPMV Time

نویسندگان

چکیده

منابع مشابه

Implementing Sparse Matrix-Vector Multiplication with QCSR on GPU

Accelerating Sparse Matrix Vector Multiplication on Many-Core GPUs

A new approach for sparse matrix vector product on NVIDIA GPUs

The sparse matrix vector product on GPUs

Applications of the streamed storage format for sparse matrix operations

عنوان ژورنال:

اشتراک گذاری